Comparison of free energy estimators and their dependence on dissipated work 
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The estimate of free energy changes based on Bennett's acceptance ratio method is examined in 
several limiting cases and compared with other estimates based on the Jarzynski equality and on the 
Crooks relation. While the absolute amount of dissipated work, defined as the surplus of average 
work over the free energy difference, limits the practical applicability of Jarzynski's and Crooks' 
methods, the reliability of Bennett's approach is restricted by the difference of the dissipated works 
in the forward and the backward process. We illustrate these points by considering a Gaussian 
chain and a hairpin chain which both are extended during the forward and accordingly compressed 
during the backward protocol. The reliability of the Crooks relation predominantly depends on 
the sample size; for the Jarzynski estimator the slowness of the work protocol is crucial, and the 
Bennett method is shown to give precise estimates irrespective of the pulling speed and sample 
size as long as the dissipated works are the same for the forward and the backward process as 
it is the case for Gaussian work distributions. With an increasing dissipated work difference the 
Bennett estimator also acquires a bias which increases roughly in proportion to this difference. A 
substantial simplification of the Bennett estimator is provided by the 1/2- formula which expresses 
the free energy difference by the algebraic average of the Jarzynski estimates for the forward and 
the backward processes. It agrees with the Bennett estimate in all cases when the Jarzynski and 
the Crooks estimates fail to give reliable results. 

PACS numbers: 05.70.Ln, 05.40.-a, 05.20.-y 



I. INTRODUCTION 

In studying the thermodynamic state of a physical sys- 
tem, the free energy F is a quantity of fundamental im- 
portance. It describes the equilibrium properties of sys- 
tems that may exchange energy with their environments. 
Formally, it is related to the internal energy, U, by a 
Lcgcndre transform F = U — TS, where T is the tem- 
perature and S the entropy. The free energy is a state 
function and hence, for any process connecting two equi- 
librium states, the respective change of the free energy 
AF = AU — TAS, solely depends on the final and initial 
states without regard to the particular process connect- 
ing them. In contrast, the work w done on the system and 
the heat Q exchanged with the environment are process- 
dependent. Yet, their sum yielding the change in internal 
energy, AU = w + Q, does not depend on the details of 
the path connecting the final with the initial state. 

Recently, Jarzynski found a relation between the path 
dependent work and the path independent free energy 
change in terms of the following sum rule [l[ , 



dwp{w)e~ fiw = (e~ Pw ) 



-/3AF 



(1) 



where p(w) denotes the probability density function 
(PDF) of the work that is performed on the system. The 
process from which this work results, starts out in a state 
of thermal equilibrium at temperature T = (fcg/3) , and 
is induced by the action of forces, or more generally by 
changes of parameters characterizing the Hamiltonian of 



the considered system. These parameter changes are sup- 
posed to follow a specific protocol, on the details of which 
the work PDF will depend in general. 

In Eq. <[T|), AF = Ff — F i; denotes the difference be- 
tween the free energies Fi and Ff of the initial thermal 
equilibrium state and the thermal equilibrium state of 
the system with the final parameter values, respectively. 
Both equilibrium states are at the same temperature T. 
In general, the second equilibrium state differs from the 
actual state that is reached at the end of the protocol. 
In principle, this "associated thermal equilibrium" will 
be approached when the system stays in contact with a 
heat bath at temperature T upon completion of the pro- 
tocol. As a difference of a state function, AF depends on 
the initial and final parameter values but is independent 
of the details of the protocol. 

The random nature of the work manifested in the PDF 
p(w) is a consequence of the inherent randomness of the 
thermal initial state and additionally also due to a pos- 
sible randomness of the dynamics, be it of quantum or 
classical, stochastic nature. In the latter case, random- 
ness and dissipation must be properly balanced by fluc- 
tuation dissipation relations eventually imposing thermal 
equilibrium at the initial inverse temperature j3 at con- 
stant parameter values. Any application of the Jarzynski 
equality ([1} to experiments hence requires repeating ex- 
periments with the same protocol many times in order to 
generate a sufficient statistics. The same requirement to 
generate sufficiently many data must also be met in nu- 
merical implementations of the Jarzynski equality aiming 
at the determination of free energy changes. 
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The feasibility of this scheme with the goal to de- 
termine the free energy change AF was demonstrated 
in various experimental systems such as for single 
molecules @, y| and classical oscillators [1, [||. Yet 
the practical applicability of the Jarzynski equality is 
severely limited because the estimate of the exponen- 
tial work average strongly relies on how well the tail 
of the work distribution with w < AF is sampled @. 
In general the exponential average {e~@ w ) gives rise to 
a bias of the Jarzynski free energy estimate [f| . It is 
now well recognized that the best convergence can be 
obtained from slow protocols associated with small dis- 
sipation such that (w) « AF, i.e., for almost reversible 
processes in which the system passes through a succes- 
sion of quasi-equilibrium states Q. For fast switching, 
the finite sampling error becomes substantial both con- 
cerning the bias and the variance. The respective behav- 
ior in the larg e sampling limit was investigated in several 
studies [1-Qjf. 

An alternative strategy to determine free energy dif- 
ferences is based on the Crooks relation [13 , 



Dividing both sides of the Crooks relation (|2|) by the 
factor of 1 + eP( w ~ AF \ one obtains 



Pf (w) = e-^ AF - w) Pb {-w) 



(2) 



which allows to infer AF without the average process 
required in Eq.([T]). Here the two work PDFs Pf(w) and 
Pb(w) refer to the original, or forward (/) protocol and to 
the backward protocol (b) which is started in the associ- 
ated equilibrium state, i.e., at the initial temperature of 
the forward process and at those parameter values that 
have finally been reached in the forward process and re- 
traces its parameter values. The free energy change can 
be read off from a plot displaying the functions p/{w) 
and Pb(—w) as the work value at which the two distribu- 
tions cross each other. A reliable estimate of the crossing 
point requires sufficient sampling of work with w < AF 
for the forward and respectively with w < —AF for the 
backward process. 

It is worthwhile mentioning here that the average work 
is never smaller than the free energy change, i.e. (w) — 
AF > 0, as a consequence of Jensen's inequality, stating 



that 



-pw\ 



> e 



Hence, the work applied to the 



system in an isothermal process is at least as large as the 
change of free energy, in accordance with the second law 
of thermodynamics. The amount of work that is equal to 
the free energy change can be applied to the system in an 
isothermal reversible process, while any surplus (w) — AF 
is "dissipated work". The dissipated work can also be 
interpreted in terms of the entropy change of the total 
system including the heat- bath, (w) — AF = TAStot [H| , 
which, again is positive in accordance with the second 
law. For rapid protocols, the dissipated work becomes 
large and therefore those realizations of the process with 
w < AF may become extremely unlikely, such that even 
large samples obtained from experiments or numerical 
investigations may result in an unpopulated no man's 
land between the forward and the backward work PDFs 
and, hence, in an unreliable estimate of the free energy 
change based on Crooks' crossing criterion. 



dw 



p f (w) 



I + e P(w-AF) 



dw Pb{ W ) /g\ 

1 + e -P(w-AF) v ' 



This equation was originally suggested by Bennett |l4| 
as an efficient basis to estimate partition function ra- 
tios by means of Monte Carlo sampling. In Bennett's 
derivation, the Fermi-function-like weig hts in Eq. © re- 
sulted as acceptance ratios from the requirement of min- 
imal variance of the partition function estimator in the 
large sample size limit. From the Gaussian assumption 
which this argument underlies one would be let to believe 
that the Bennett estimator yields minimal variance only 
if the overlap region of the forward and backward PDFs 
Pf(w) and pp(—w) is substantially populated. However, 
Shirts et al. [15[ demonstrated that Bennett's acceptance 
ratio method always yields a maximum likelihood esti- 
mate of the free energy change by the use of forward and 
backward data. As such, it allows to extract AF with 
the smallest variance compared to any other free energy 
estimator, even if the two PDFs, Pf(w) and Pb(— w) do 
not overlap. Therefore, the estimation of the free energy 
change from Eq. (3) is less restricted than the Jarzyn- 
ski method and Crooks' crossing criterion as has been 
confirmed in various recent studies fl6l - fl8l |. 

In this work, we investigate the statistical behavior 
of the free energy change estimation by the Bennett 
method, discuss its limitations imposed by the differ- 
ence of the amounts of dissipated work in the forward 
and backward protocol, and compare it with the meth- 
ods proposed by Jarzynski and Crooks. In our discus- 
sion we lay particular emphasis on the practical lim- 
itations resulting from the noncquilibrium nature that 
is imposed on the system by time dependent perturba- 
tions. We begin in Section II with a brief review on how 
the dissipated work and the time asymmetry constrain 
the Jarzynski and Crooks method. In the following Sec- 
tion III, we shortly review the maximum likelihood ar- 
gument [THj leading to Eq. ([3]) and examine its solutions 
in some limiting cases. In particular when the Jarzynski 
and Crooks methods are both hampered by large dissi- 
pated work, we show that the Bennett method simplifies 
to the "1/2-formula" expressing the free energy change 
as the arithmetic mean of the Jarzynski estimation ob- 
tained from the forward and the backward process. In 
Sec. IV, we take as an example a Gaussian chain, and 
consider processes in which work is done by extending or 
compressing the chain at a constant speed. Reliable esti- 
mations of AF based on Crooks' crossing criterion or the 
Jarzynski estimator do not exist if the dissipated work 
becomes large. We present regions in a pulling-speed 
versus sample-size plane, in which the errors of these es- 
timators are smaller than fcsT. For the Gaussian chain, 
the dissipated works in the forward and backward pro- 
cesses are the same, and, as a consequence the Bennett 
method gives precise estimations of AF. In Sec. VI, we 
consider a chain of monomers interacting via pairing po- 
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given by 
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(<w>-AF)/k B T 

FIG. 1: The probability Pf* to obtain a work smaller than 
the free energy change in a realization of the forward process 
is displayed as a function of the dissipated work {w) f — AF = 
/3a 2 1 '2 for a Gaussian process as given by Eq. (7) . Almost 
reversible protocols with small dissipated work give a proba- 
bility Pj? close to the maximum 1/2. In the opposite limit of 
large dissipated work P^~ becomes exponentially small. 



tentials to form a hairpin-like structure. Keeping one end 
fixed and pulling at the other end with constant speed 
gives rise to a non-Gaussian work distribution with dif- 
ferent dissipated works for the forward and the backward 
protocols. The difference increases with growing pulling 
speed and leads to a finite bias of the Bennett estimator. 
The results arc summarized in Sec. V. 



II. GENERALITIES 

We consider a time dependent perturbation of a sys- 
tem in terms of a control parameter X(t) that varies in 
time t according to a prescribed protocol. The protocol 
can be performed in bidirectional way. For the forward 
protocol during a time interval t £ (U,tf), the system 
departs from its initial thermal equilibrium state at tem- 
perature T and control parameter A(t^) and reaches a 
terminal state with X(tf), which, in general, is a nonequi- 
librium state. The work done during this process will be 
referred to as the forward work. For the backward proto- 
col, the system starts out in thermal equilibrium at the 
same inverse temperature j3 with the control parameter 
at X(tf). In this case, the parameter retraces its values 
in time- reversed order and finally reaches X(U) as ter- 
minal parameter value. After repeating infinitely many 
times both protocols, two probability distribution func- 
tions, Pf(w) for the forward and Pb(w) for the backward 
protocol, can be assembled. 

As mentioned in the introduction, the probability for 
observing work less than the free energy change is the 
crucial factor for the applicability of the Crooks' cross- 
ing criterion. For the forward process this probability is 



AF 



dwpf(l 



and, accordingly, for the backward process, by 



dwpb(w). 



(4) 



(5) 



If these probabilities are too low to get a sufficient num- 
ber of realizations of work below the free energy change 
within a reasonable total number of experiments, the 
Crooks' crossing criterion fails to give AF. Moreover, 
a poor sampling of work data below AF will lead to a 
substantial underestimation of the exponential average 
(e~P w ) resulting in a too large Jarzynski estimate of AF. 

As an example, we consider a Gaussian work PDF 
Pf(w) for the forward process. As a consequence of the 
Crooks relation ([2]) the backward work PDF pb{w) is 
also Gaussian with the same variance a 2 . Hence the two 
PDFs are: 



Pa(w) 



1 



V2ira 2 



cxp 



(W - (w)q) 2 

2a 2 



a = f,b. (6) 



In this case the free energy change becomes AF = 



pV/2 



(w) b - 



-/3<7 /2. The probability of finding work 



less than the free energy change in the forward protocol 
then becomes 



P< 



r ric (^) ' 



(7) 



where erfc(x) denotes the complementary error function, 
see Fig. 1. If the process is weakly dissipative, i.e for 
P({w) f - AF) = (/3er) 2 /2 < 1, the forward work distribu- 
tion becomes narrow and the argument of the error func- 
tion is small. Because of erfc(x) ~1- 2x/y/n for i<1, 
the probability Pf approaches the value 1/2 in the limit 
of vanishing dissipated work [Hi ]. In other words, on av- 
erage, every other measurement probes a work that is less 
than the free energy change (Fig. 1). In the opposite limit 
of a strongly dissipative process being characterized by 
(w) / — AF ^ /3 _1 , the work PDFs become broad because 
then fia ^> 1. Using the asymptotic behavior of crfc(x) 
for x 3> 1, erfc(a;) ~ e~ x /(^/ttx), we obtain an expo- 
nentially small probability ~ exp(— /3 2 er 2 /8)/(/3er) to 
find a work in the forward protocol that is less than the 
free energy change. 

A measure quantifying the difference between the for- 
ward and the backward work PDFs Pf(w) and pb(—w) 
based on the Jensen-Shannon divergence of the two dis- 
tributions is given by the so-called time asymmetry [20l — 
reading 
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FIG. 2: (Color online) (a) The time asymmetry A defined by 
Eq. (6) for a Gaussian PDF monotonicly varies as a function of 
the hysteresis h = ((w) j + (w)t)/2 — j3a 2 /2 from vanishingly 
small values at small ft to a maximal value at large h. Note 
that the hysteresis h gives the average dissipated work for 
the forward and the backward protocol, and, for a Gaussian 
work distribution, characterizes the separation of the peaks 
of the forward and backward work pdfs Pf(w) and Pb(— w), 
respectively. The PDFs at the values indicated by the arrows 
in panel (a) are displayed in the panels (b) and (c). (b) For 
weak dissipation ((/3a) 2 = 1), the time asymmetry is small 
(A ~ 0.1) and the work PDFs display a large overlapping 
area near AF. (c) For strong dissipation (/3cr 2 = 100), the 
time asymmetry approaches its maximum (A — In 2) and the 
work PDFs are well-separated. 



where (X(w))f = J dwpf(w)X(w) and (X(w))b = 
J dwpb{w)X(w) denote averages with respect to the for- 
ward and the backward distributions, respectively. This 
quantity vanishes when Pf(w) and pb(—w) agree with 
each other and otherwise falls in the range < A < In 2. 
Therefore, if the measured work values are mostly pop- 



ulated in the close vicinity of the free energy change, 
AF for the forward protocol and — AF for the backward 
protocol, the time asymmetry approaches its minimum 
value A = 0. In this case, Pf(w) and pb(—w) have a large 
overlap. On the other hand, the time asymmetry reaches 
its maximum In 2 when Pf(w) and pb(—w) are perfectly 
separated by a large gap between the respective regions 
with w 3> AF for the forward protocol and w — AF 
for the backward protocol. 

We exemplify these behaviors for Gaussian work PDFs, 
given by Eq. ^ : Figure 2 (a) displays the time asymme- 
try as a function of the peak separation between p/(w) 
and Pb(—w), that is as a function of the hysteresis h de- 
fined as 12011 



1 



(9) 



For small values of h, the time asymmetry is close to 
zero, and the corresponding PDFs substantially over- 
lap in an area near AF as displayed in Fig. 2(b) for 
(3h = 1/2. In contrast, when h is large the time asymme- 
try approaches its maximum, and seemingly, the forward 
and backward PDFs pj(w) and Pb(— w) no longer overlap 
with each other as exemplified in Fig. 2(c) for j3h = 50. 
Strictly speaking, due to the Crooks relation ^ Pf(w) 
and pb(—w) must share the same support and therefore 
always have a finite overlap, which, however, may be- 
come virtually invisible. Little overlap of the forward 
and the backward work PDFs indicates that the process 
involved in the work generation is highly irreversible, and 
hence, the direction of time flow is unambiguous giving 
the information In 2 of a single bit. Also, for this case, 
large dissipated work is associated with both processes, 
and the probabilities P b K and Pf to find work values be- 
ing less than the free energy change become extremely 
small (Fig. 2(c)). In summary, the more pronounced is 
the direction of the time-arrow in a nonequilibrium pro- 
cess, the rarer it becomes to find events with work smaller 
than the free energy change and the more erroneous both 
estimators by Jarzynski and Crooks turn out to be. 

Finally, we note a relation between the time asymme- 
try and the hysteresis h specifying the average dissipated 
work of the forward and the backward protocol. Because 
the free energy does not change upon completion of a 
cyclic process, the total dissipated work agrees with twice 
the above introduced work hysteresis, 2h. As pointed out 
in Ref. [f|, the hysteresis gives a rough estimate for the 
number of measurements M c required for a relatively re- 
liable estimation of AF based on the Jarzynski equality, 
as M c > e' 9 ' 1 . On the other hand the hysteresis is related 
to the time asymmetry by the following inequality [20j . 



,.fth 



> 



1 



2c 



-A 



1 



(10) 



As a consequence the required number of measure- 
ments becomes infinitely large if the time asymmetry 
approaches the limiting value, In 2. Therefore not only 
the Jarzynski estimator ceases to work but also Crooks' 
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crossing criterion fails for processes with unambiguous 
arrows of time. 



III. BENNETT'S ACCEPTANCE RATIO 
METHOD 

As mentioned in the Introduction, Shirts et al. [HI ob- 
tained the Bennett relation ([3]) by means of the statistical 
concept of maximum likelihood. Here, this approach is 
based on a transformation of the Crooks relation J2]) into 
an expression for the probability P{f\W) with which the 
forward protocol is drawn from an ensemble of equally 
many realizations of both protocols under the condition 
that the work performed on the system is w — W, for 
the forward, and w — —W for the backward protocol. 
This conditional pro bability takes the form of a Fermi 
function, reading [15| 



P(f\W) 



1 



1 + cxp [-0(W - AF)] 



(11) 



The complementary probability P(b\W) = 1 — P(f\W) 
for finding If in a realization of the backward process 
then becomes 



P(b\W) 



1 + cxp \p(W - AF)] 



(12) 



According to the maximum likelihood method, the most 
likely value of the free energy change compatible with 
the work values of M realizations of the forward and 
equally many realizations of the backward protocol max- 
imizes the likelihood defined as the joint probability 
£(AF) = Hf P(f\wjj)P(b\ - w j>b ) evaluated at the ac- 
tual outcomes Wj t0l , j = 1..M, of the forward, a = /, and 
the backward, a = b protocols. This leads to 



M 

— Y- 

M^l 



1 



M ^— ' 1 + e^ w ^f- AF "> M 4— 1 1 + e P(wj. b +AF) ■ 
j=i i=i 



1 M 

— Y- 

M ^ 1 



1 



(13) 



presenting a non-linear equation in AF which, for given 
data Wj t and Wj t,, has a uniquely defined solution. It can 
numerically be solved by means of the Newton algorithm. 
In the continuum limit of M — > oo this equation can be 
written as 



dw 



p f (w) 



1 + e Kw-^F) 



dw 



p b {w) 



I + e 0(w+AF) ■ 



(14) 



yielding Eq. ([3]) upon a change of the variable w — > —w 
on the right hand side. 

In the limiting cases of slow and rapid protocols, 
Eqs. ([13]) and (fl"4"]l have simple solutions. When the 
work protocol is performed quasi-statically so that the 
associated dissipation is small, the majority of work val- 
ues are localized near the free energy difference such that 
\w - AF\ < k B T for the forward, and \w + AF\ < k B T 



for the backward protocol. The expansions of the expo- 
nential factors in Eq. (|14[) in terms of their small argu- 
ments lead to 



dw(w — AF)pf(w) 



dw(AF — w)pb(-w) 
(15) 



yielding for the free energy change 

(w) f - (w) b « 2AF. 



(16) 



For a slow work protocol with small dissipations, the Ben- 
nett estimate of the free energy difference is given by the 
difference of averaged works of forward and backward 
protocols. 

On the other hand, for a fast protocol generating large 
dissipation, the overwhelming majority of forward data 
satisfies Wjj — AF 3> fceT and accordingly wj^ + AF ^> 
ksT for the backward data. Then, we can expand the 
Fermi-functions in Eq. (|13j) giving in leading order 



C 20AF M 

M ^ 



M 



3=1 



This yields a central result of this work 

2AF w AFf - AF b = 2AF H , 



(17) 



(18) 



where the Jazynski estimates AF a are determined from 
M realizations of the forward (a = f) and the backward 
(a = b) protocols as, 



pAF a 



In 



1 M 

— Y< 

3 = 1 



(19) 



The Bennett estimate AF in the large dissipation regime 
is given by half of the difference of the forward and the 
backward Jarzynski estimates, AFf — AF,, which we de- 
note as AFh- 

Due to the finite sampling of work values, the esti- 
mates AF Q are still random and hence can differ from 
experiment to experiment. In order to characterize the 
statistics of the Jarzynski estimator for a finite number 
M of work data, one considers m repetitions of M work 
measurements. The totality of work data then consists 

of {[v)j ], [«^], ■ • • , [if]"^]} where [wj ] is the data set 

from the £-th experiment: [w^] = {wf\ w^i ' ' ' w m}- 
Based on the resulting set of the Jarzynski estimates for 
M data according to Eq. (fTT)|) the statistics of these esti- 
mates can be analyzed. In particular, one obtains a block 
averaged estimate of the free energy difference from m 
experiments reading 



^ m 

f3AF^ = - lim — V In 

m— >oo m — ^ 



(=1 



i M 



3=1 



(20) 



It was proven that the block averaged Jarzynski estima- 
tor, AF Q , with a finite M is bounded from below by the 
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true value of AF and from above by the average work 
(w) a , i.e., it lies in the range [ill |23| 



AF < AF a < (w) 



(21) 



Unfortunately, neither a lower nor an upper bound of 
the Bennett estimator is known. The underlying diffi- 
culty results from the structure of the Eq. (fl~8|) . which 
is based on the difference between the forward and back- 
ward Jarzynski estimates. Therefore the inequalities (|21[) 
valid for the individual terms do not translate to the Ben- 
nett estimation in the large dissipation limit. 



IV. GAUSSIAN CHAIN 

We illustrate the aforementioned features by consider- 
ing a one-dimensional Gaussian chain of (N + 1) beads 
connected by harmonic springs. Additionally, the beads 
experience strong friction and fluctuating forces stem- 
ming from a heat bath at temperature T. The potential 
energy of the system is given by 



JV 



(22) 



i=l 



where Xi denotes the position of the ith bead; the first 
bead with i = is fixed at the origin (xq =0). A sim- 
ple way to perform work on the system is to pull the 
last bead (i = N) with a constant speed. The forward 
protocol acts during a time interval t £E (0,t/), starting 
from the thermal equilibrium state of the chain with its 
end at the origin, xn(0) = 0, and proceeds by increasing 
XN(t) linearly in time as XN(t) = vt until the chain end 
reaches the designated position xat(£/) = x&. For the 
backward protocol, the chain is initially in the thermal 
equilibrium while its end is fixed at Xd and the work is 
done by changing the chain end position as x(t) ~ x^ — vt. 
The Jarzynski work for the forward protocol is given by 



dt 



dU({ Xi }) 



dx 



= vk 



N 



dt[xN(t) - XN-l(t)], 



(23) 

and the same expression with v — > —v gives the work for 
the backward protocol. The overdamped motion of the 
beads (i = 1, 2, • • • , N — 1) with the friction constant 7 
can be described by a position Langevin equation: 



7" 



dxi 
dt 



-k(x i+1 + - 2x ; ) + &(*), (24) 



where the Gaussian white noises £j (t) model thermal ran- 
dom forces exerted on the chain. Accordingly, they sat- 
isfy <&(*)> = and (f)> = 2 1 k B T5 i , j 5{t - t'). 
Eq. (j2"4"]l describes the Rouse model of a polymer in a 
viscous fluid where the hydrodynamic interactions are 
neglected. 

This model is exactly solvable as shown by Dhar [24j ]. 
The free energy difference associated with stretching in 




FIG. 3: (Color online) (a) The work hysteresis h as defined 
in Eq. <(9j is displayed as a function of the time scale ratio 
tr/tf representing the pulling speed, for AF = 15fcsF, and 
for different lengths of the chain, N — 40 (•) and N = 10 
(A) in the regime of slow pulling speeds. Note that h/kbT 
agrees with the work variance of the Gaussian chain. For 
vanishing pulling speed {tr/tf — > 0) the variance vanishes. At 
higher pulling speeds, the chain-length dependence is more 
pronounced. The panel (b) shows the deviations of the block 
averaged Jarzynski estimator as defined in Eq. (|20[) from the 
exact value of the free energy change as a function of pulling 
speed. The data were sampled from the exactly known Gaus- 
sian work distributions given by the Eqs. ([6j and (|26[) . rather 
than by simulations of the Langevin equations (|24[) . The sam- 
pling size was chosen as M — 10 4 (o for N = 40 and A for 
N = 10) and M = 10 6 (• for N = 40 and A for N = 10). 
In both cases the number of m = 3 x 10 2 blocks led to well 
converged block averages. As the pulling speed increases, the 
error due to the finite sampling size becomes significant. Only 
the protocol taking much longer than the relaxation time of 
the system (t r /tf < O(10~ )) yields an error less than ksT. 
In both panels, the lines connecting the symbols are guides 
to the eye. 



the forward process is given by 

AF=f|. (25) 

and accordingly — AF for the backward process. The 
probability distribution of the work is of Gaussian form 
as given by the Eq. (EI, with the forward work average 
and the variance 1241 . 



( w )f 



AF 



a' 



-Ct 



(26) 

l)/T]jV-l,iV-lj 



where 

Sij—i, and r 



£ is the lattice Laplacian, Cij = 2Sij — — 
(kf'y)tf. For the backward protocol, the 
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FIG. 4: (Color online) The Jarzynski bias as a function of 
the time asymmetry for N — 40 for pulling protocols with 
AF — 15fcsT. The duration of time in which the final end to 
end distance was reached was chosen as tf = 2~ n 10 3 i r with 
n varying from n = up to n — 15 for different block sizes, 
M = 10 4 (▼) and M = 10 6 (•). Avera ges are performed 
over m = 3 x 10 2 blocks. Irrespective of the block size M, 
the bias remains within ksT when the time asymmetry is 
less than In 2 (represented by the vertical dotted line) , and 
abruptly rises in the limit of maximal time asymmetry, A — ► 
In 2. The broken and solid lines connecting equal symbols 
serve as guides to the eye. The vertical bars at the symbols 
indicate the according variances of the Jarzynski estimator. 

PDF is also Gaussian with the same variance, a as given 
in Eq. (f2l>|) . and the work average 

( w ) b = -AF+^a 2 , (27) 

indicating that the free energy difference for the Gaussian 
distribution is determined by the difference of the for- 
ward and backward work averages: 2AF = (w) / — («;}&. 
On the other hand, the hysteresis, defined by Eq. (j9|) be- 
comes h = /3cr 2 /2. For a given final extension the pulling 
speed is a crucial factor in determining the value of a, 
as it appears in the prefactor. According to Eq. (|26|) . 
for t —> 0, i.e. for a sudden change of the end-to-end 
distance, the variance approaches a finite value which is 
almost independent of the chain length (24]]. With in- 
creasing duration of the protocol the work variance be- 
comes dependent on the chain length, indicating a col- 
lective response of the chain. In the limit of infinitely 
slow processes the variance asymptotically vanishes as 
1/t, as one would expect for an isothermal quasi-static 
protocol. "Slow" and "fast" can be quantified relative 
to characteristic time scales of the system, for example, 
to its relaxation times. For Gaussian chains considered 
here the relaxation time is given by t r — j/{k\ m ) with 
A m being the minimum eigenvalue of the negative lattice 
Laplacian C: A TO = 2 — 2 cos [n/ N]. For N ^> 1, the relax- 
ation time increases quadratically with the chain length. 
Since we are interested in the efficiency of different esti- 
mators of free energy change we compared these for two 
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FIG. 5: (Color online) The finite sampling behavior of the 
Jarzynski estimator for AF = 15fcsT, and various progress 
rates of the protocol, t r /tf, as a function of the sampling size 
M. Averages are performed over m = 10 4 blocks. The es- 
timated free energy difference shows the monotonic behavior 
predicted by Eq. Q20p . and is larger than the true value for any 
finite M in agreement with Eq. (|21|) . The lines represent the 
large- M asymptotic behavior, Eq. (|28p . Only the protocols 
with slow pulling speeds for t r /tj = 1/30 and t r /tf = 0.05 
display this asymptotic behavior. In order to see it for faster 
pulling speeds one has to go to even larger sample sizes. 

different chain lengths and different protocols, in all cases 
leading to the same change of the free energy. Accord- 
ing to Eq. (|2"5)) the final extension then depends on the 
chain length as xa oc y/~N for large N. The ratio of the 
two time scales, t r /tf = v/(X m y / 2NAF/k) then quanti- 
fies the progress rate of the protocol: In particular, for 
U/tf <C 1, the protocol approaches a reversible processes 
with vanishing dissipated work. 

For given AF and t r /tf, the exact value of the vari- 
ance a can be found using Eqs. (|25j) -(|27 |) . which com- 
pletely characterize the Gaussian work PDF given by 
Eq. ^ [24|]. This Gaussian distribution indicates the 
ideal work PDF which one would obtain from an infinite 
number of measurements, leading to unbiased results of 
the estimators. In order to address the statistical bias 
for a finite number of data, we investigated the depen- 
dence of various free energy estimators on t r /tf and on 
the sample size by drawing data from the exactly known 
Gaussian work distribution given by the Eqs. ©: In this 
way we could avoid time-consuming simulations of the 
Langevin equations ([24| and yet obtain large amounts of 
data with low computational effort. 

First, in Fig. 3(a), we present results for the hystere- 
sis h as a function of t r /tf. From the value of h at a 
given tr/tf, one can estimate the required number of 
measurements for a reasonable estimate of AF by the 
Jarzynski equality according to M c > e^ h . For example, 
for U/tf = 0.1, M c ~ e 10 - 10 4 . If the chain is pulled 
faster, M c may become enormously large so that the free 
energy estimation from a finite number of measurements 
becomes totally unreliable. We used the known work 
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PDF of the Gaussian chain as a test case and determined 
block averages over m = 3x 10 2 blocks of different sizes. 
Figure 3(b) shows the errors in the free energy estimation 
from the forward work measurement, (AFf — AF)/ksT, 
where AFf is obtained via Eq. (|20|) for two different sam- 
ple sizes with M = 10 4 and M = 10 6 . With t r /t f the 
dissipated work increases giving rise to increasing devi- 
ations of the estimated free energy change from its true 
value. 

With increasing time asymmetry the total dissipated 
work grows. For the Jarzynski estimator this growth 
leads to a systematic increase of its bias relative to the 
exact value as displayed in Fig. 4. In particular, the 
bias diverges if the limiting value In 2 of the time asym- 
metry is approached. This divergence persists for any 
finite number of data in accordance with the exponen- 
tial scaling of the required number of data, M c , with the 
hysteresis, M c cx e . The bias of the Jarzynski estimator 
is presented in Fig. 5 as a function of the block size M 
for different pulling speeds. It asymptotically approaches 
zero at different rates depending on the pulling speed for 
large values of M. According to Ref. [1(J, [llj the asymp- 
totic approach is given by a power law reading: 

(AF~-AF)/k B T= (/ a2 -l)/(2M) + 0(A/- 2 ). (28) 

This behavior though can only be identified for the two 
cases with slowest pulling speeds t r /tf = 1/30 and 
t r /tf = 0.05 (See the straight lines in Fig. 5). For the 
other, faster protocols, the sample sizes examined here 
are insufficient to enter the asymptotic regime being gov- 
erned by the central limit theorem. Instead, we observe 
algebraic decay behavior as 1/M a with a < 1. For the 
sample sizes studied here, the decay exponent a becomes 
very small with increasing pulling speed, signaling a bad 
convergence of the finite sample average for fast proto- 
cols. 

Our next goal is to systematically compare the biases 
of three estimators, the Jarzynski equality, Crooks' cross- 
ing criterion, and the Bennett method, for the Gaussian 
work PDF as exemplified by the pulling protocol of the 
Gaussian chain. In Fig 6(a), we display the absolute mag- 
nitude of the biases of the three estimators as functions 
of t r /tf and sampling size M. As expected, the Ben- 
nett method is always superior to the other estimators. 
The symbols J, C and B in Fig. 6 (a) indicate those pa- 
rameter regions within which the Jarzynski, Crooks and 
Bennett estimators, respectively, deviate by less than the 
thermal energy fc#T from the true free energy difference. 
The Jarzynski estimator soon becomes unreliable with in- 
creasing pulling speed, while the Crooks estimator yields 
better values for larger pulling speeds provided that it 
is based on a sufficiently large number of data. It is in- 
teresting to note that for small M and very slow pulling 
speeds the Jarzynski estimator performs better than the 
Crooks estimator. To better understand this observation, 
note that already for M = 1, the block average (f2"6")l 
yields AFf = (w) * which is close to the correct value for 
weakly dissipative protocols, according to Eq. (|2"0"|) . On 
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FIG. 6: (Color online) (a) Efficiency diagram of the three 
methods for N = 40 and AF = 15kBT. Each sample of size 
M was independently generated m = 10 4 times. The sym- 
bols J, C, and B are assigned to those regions in the param- 
eter plane spanned by the inverse box size and t r /tf, where 
the Jarzynski, Crooks, and Bennett methods, respectively, 
give an estimate of the free energy difference with an error 
being less than fcsT. Along the line connecting the squares, 
the expected error of the Jarzynski estimator equals feT, i.e., 
AFf — AF = ksT. Below this line, for smaller pulling speeds, 
the error is smaller. The line connecting the circles indicates 
the corresponding curve for the Crooks estimator, (b) Esti- 
mated bias of three estimators of free energy difference as a 
function of t r /tf with M = 10 4 and m — 3 x 10 2 . The Jarzyn- 
ski error becomes larger than ksT as soon as t r /tf > 0.1. The 
Crooks crossing criterion gives the correct value of AF up to 
tr/tf ~ 0.2, where the overlap between the forward and the 
backward PDF disappears so that the Crooks criterion only 
delimits the range within which the free energy difference is 
located. Apart from an increase of the expected sampling er- 
ror, Bennett's method according to Eq. (|13p is insensitive to 
the pulling speed, yielding precise estimates of AF. The error 
bars indicate the magnitude of the variance of the free energy 
estimators based on m = 3 x 10 2 blocks of size M = 10 4 . The 
variance of the Jarzynski estimator increases rapidly with in- 
creasing speed, while the variances of the Crooks and the 
Bennett estimators grow much slower. 



the other hand, for the Crook's crossing criterion to prop- 
erly work, the tails of the forward and backward PDFs 
need to be sampled with sufficient accuracy, necessitat- 
ing a sample of reasonable size. The Bennett estimator 
for this model with a Gaussian work PDF is found to 
be free of any bias in the whole investigated parameter 
region. Yet with increasing pulling speed the variance 
of the Bennett estimator increases as displayed in panel 
(b) of Fig. 6 for a fixed M. This increase though is less 
pronounced than those of the respective variances of the 
Crooks and the Jarzynski estimators. 
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FIG. 7: (Color online) A schematic picture of a hairpin chain 
with N = 20. The pairing interactions are present between 
the i-th and N—i-th monomers where i = i c , i c +l, • • • , i c +Nb 
with Nb + 1 denoting the number of pairs. In this work, we 
considered a hairpin chain with iVj = 3. 



V. CHAIN WITH A HAIRPIN 

A particular feature of a Gaussian work distribution 
given by Eq. (J6j) , such as the one for the pulling process of 
a Gaussian chain, is the mirror-symmetry of the forward 
and backward distributions with respect to AF relat- 
ing Pf(w) and pb(—w). As a consequence, the dissipated 
works are equal for the forward and the backward pro- 
cesses, and, moreover, the Bennett estimator is unbiased 
(See Fig. 6(b)). In order to demonstrate the dependence 
of the Bennett estimator on the difference of the forward 
and backward dissipated works, we thus need to consider 
an asymmetric work PDF. As shown in the first experi- 
mental realization of the Jarzynski equality , pulling a 
hairpin molecule typically exhibits pronounced asymmet- 
ric work PDFs, depending upon the direction of the pro- 
tocol. Therefore, we consider a chain in three dimensions 
consisting of N monomers with Hookcan bonds along the 
chain and, additionally, pair-specific interactions lead- 
ing to a hairpin structure in the mechanical equilibrium 
state, as depicted in Fig 7. The potential energy of this 
chain is given by 



=i 
E 



iV 

'tV- 

=1 



a? 



/ \ 12 




/ a \ _ 


( " ) 
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\n,N-i J 



(29) 



where r 4 = (x^y. 
monomer, and r. 



i) denotes the position of the i-th 
j j — |rj — Tj\ the distance between two 
monomers. The first sum represents the contribution of 
the Hookean springs connecting neighboring monomers 
along the chain. Here a is the equilibrium bond length 
and k the spring-constant. The second sum describes 
the interaction between pairs of monomers i and N — i, 
given by the Lennard-Jones potential. The summation 
index runs over the monomers constituting the pairs as 
depicted in Fig 7. The total number of pairs is Nf, + 1 . As 
for the Gaussian case, additionally, strong friction and 
random forces act on the beads of the chain resulting 
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FIG. 8: (Color online) Histograms of the work performed on 
a hairpin chain as depicted in Fig. 7. The chain parameters 
are N = 20, k = 30 and e = 20; the maximal end to end dis- 
tance of the chain reached at the end of the forward protocol 
is Xd = 25a; for further details see the text below Eq. (|30f) . 
Each histogram was obtained as an average over m — 10 
independently generated histograms based on M = 10 4 sim- 
ulations of the Langevin equation (|30|) . The variances of the 
such estimated probabilities indicated by error bars are rel- 
atively small. On the right hand side the histograms of the 
forward protocols and on the left hand side those for neg- 
ative work of the backward protocols are depicted. Panel 

(a) displays the case when the protocol lasted tf = 900^"° 
representing a relatively slow pulling speed for which the for- 
ward and backward work histograms overlap. Note that here 
tr D is the relaxation time of a three-dimensional Gaussian 
chain, t%P = t r /3 — j/(3k\ m )- Consequently a reliable es- 
timate of AF — 89.4fcflT indicated by the vertical dotted 
line was obtained from the Crooks crossing criterion. Panel 

(b) gives the work distributions for a five times faster pulling 
speed (tf = 180tf. D ). At this higher speed the histograms be- 
come broader and more dissimilar and no longer overlap with 
each other. 



in an overdamped dynamics which is governed by the 
Langevin equation 



7^ = -V i C7({r i })+&(t), 



(30) 



where 7 denotes the friction constant experienced by 
a monomer, Vj the three dimensional gradient with 
respect to the position of the ith monomer, and 
= {€i,x(t),£i,y(t),£i,z(t)) thermal Gaussian white 
noise forces satisfying (&,<(£)) = and (€i,e(t)£j,m{t')) = 
2^kBTSijdi tm 5(t — t'). Similarly as for the Gaussian 
chain, the one end of a hairpin chain is fixed at ro = 
and the other end rjv is pulled with constant speed v 
in the x-direction. As for the Gaussian chain, the work 
performed in this way is given by v J Q / dtdU({ri})/dxN 
with the hairpin potential, Eq. (f2"9")l . 

Unlike the case of pulling a Gaussian chain, for a pulled 
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hairpin chain, we do not know the exact work distribution 
from which we could sample the data. In order to study 
the effect of an asymmetric work PDF on the biases, we 
had to perform direct simulations of the Langevin dy- 
namics. For the numerical simulations, we rescaled all 
lengths by the bond length according to f = r/a and dis- 
cretized the Langevin equation with a time step A. The 
iterative Langevin equation then reads in terms of the 
discrete time variable n = t/A as 

ii(n + 1) = r<(n) - j!V t U({r(n)}) + £», (31) 

where U = U/ksT is the rescaled potential, and jl = 
A/t a denotes the rescaled mobility with t a = 7a 2 / (ksT) 
being a characteristic time-scale of the chain. The vari- 
ances of the dimensionless Gaussian random forces £ (n) 
are determined by (ii,k(m)£j/(n)) = 2p,8m,n^k,A,3- The 
dimensionless potential U is determined by the dimen- 
sionless spring constant k = /ca 2 /fcsT and the dimen- 
sionless energy parameter of the Lennard- Jones potential 
e = e/ksT. In the simulations these parameters were set 
as k = 30 and e = 20. For the efficiency as well as the 
numerical convergence of the simulations, the rescaled 
mobility was chosen as £t = 0.0001. We considered a 
hairpin-like molecule with N=20 monomers having pair- 
ing potentials between the monomer pairs (4, 16) up to 
(7,13). 

Before the forward protocol was started, the chain had 
been prepared into a thermal equilibrium state having 
the form of a hairpin with constrained positions of the 
first and last monomers, i"o = 0, r^(0) = ax, respec- 
tively, where x denotes the unit vector in cc-direction. 
This thermal equilibrium state was established by sim- 
ulating the Langevin equation pip for a sufficiently 
large time with clamped end positions. Upon equilibra- 
tion, the last monomer was pulled at a constant speed 
v in the .T-direction until it had reached the distance 
Xd = |r.y(£/) — r/v(0)| = 25a. The backward protocol 
was started with a thermal equilibrium distribution with 
the first and last monomer at the final positions of the 
forward protocol; then the chain was compressed by mov- 
ing the last monomer at the same absolute velocity v in 
the — x-direction until it had reached the initial position 
of the forward protocol, r^r = ax. 

The histograms displayed in Fig. 8 represent averages 
of m = 10 2 raw histograms each based on M = 10 4 sim- 
ulations of the Langevin equation for forward and back- 
ward protocols at two different pulling speeds. The sta- 
tistical uncertainty of these averages were estimated from 
the variance of the distributions of raw histograms and 
are indicated by error bars. For the slow protocol dis- 
played in the upper panel, the forward and the backward 
histograms cross at AF = 89.4 indicated by the vertical 
dotted line, and appears as almost symmetric about the 
crossing point. In contrast, for the fast protocol shown 
in the panel (b), the two histograms do not overlap, and 
therefore, do not allow to extract AF from the Crooks re- 
lation. Moreover, these histograms are no longer mirror 



symmetric; the forward histogram is significantly more 
dispersed than the backward histogram. We also present 
the biases of the free energy estimators in Fig. 9(a). As 
the protocol speeds up, the bias of the Jarzynski esti- 
mators (open squares) becomes more pronounced and, 
at the same time, the dissipated work (triangles) grows. 
We note that the 1/2-formula (open diamonds) given by 
Eq. (jT8")l perfectly coincides with the Bennett estimation 
(filled red triangles) . Both methods lead to a significantly 
smaller bias than the Jarzynski estimate. 

Unlike the Gaussian case, the dissipated work is larger 
during the forward than during the backward process; see 
the curve for (w) / — AF in comparison with the curve for 
(w)b + AF in Fig. 9(a). Figure 9(b) presents the bias of 
the Bennett estimator as a function of the difference be- 
tween these dissipated works. The monotonic increase of 
both quantities as functions of the time-ratio t r /tf leads 
to a proportionality between the bias of the Bennett esti- 
mator and the difference between the dissipated works in 
the forward and backward protocols confirming the pre- 
vious conjecture. The influence of the difference between 
the dissipated forward and backward works on the Ben- 
nett estimator can be understood qualitatively, at least 
in the limit of large dissipation. According to Eq. (fl"8|) , 
in this limit, the Bennett estimator can be expressed by 
half of the difference of the Jarzynski estimators for the 
forward and the backward protocol. For a Gaussian work 
distribution the forward and the backward distributions 
Pf(w) and pb(—w) are symmetric with respect to AF and 
consequently, the biases of the two estimators are identi- 
cal and compensate each other in the 1/2 formula. On the 
other hand, in all cases with different dissipated works of 
the forward and the backward processes, the symmetry 
of the forward and backward distributions is apparently 
lost and therefore the forward and the backward Jarzyn- 
ski estimators will have different biases, which then no 
longer compensate each other in the 1/2 formula. 



VI. SUMMARY 

In summary, we discussed the statistical behaviors of 
the free energy estimators based on the Jarzynski equal- 
ity, Crooks' crossing criterion, and Bennett's acceptance 
ratio method, in relation to the amount of dissipated 
work and the time asymmetry. In particular, we inves- 
tigated the limiting behaviors of the solutions of Eq. ([3]) 
and demonstrated that while both the Jarzynski and the 
Crooks method are hampered by large dissipated work, 
the finite sampling error of Bennett's method is deter- 
mined by the difference of the dissipated works of the 
forward and the backward processes. As a consequence, 
it is less severely influenced by the entropy production 
but rather by the asymmetry between the forward and 
the backward process. The examples of a Gaussian and 
a non-Gaussian chain considered here demonstrate these 
features. The finite sampling error of the Jarzynski es- 
timator rapidly increases with the amount of dissipated 
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FIG. 9: (Color online) (a) The bias of the free energy esti- 
mators for a hairpin chain with parameter values specified in 
Fig. 8 is depicted as a function of the inverse duration of the 
pulling protocol t / . The statistical error bars are smaller than 
the size of symbols. With increasing speed, the Jarzynski bias 
and the dissipated work of the forward and the backward pro- 
cess grow fast. At the same time but at a much slower rate 
also the difference between the forward and the backward 
dissipated work as well as the bias of the Bennett estimator 
increase, (b) The Bennett bias is shown to be proportional to 
the difference between dissipated works, (w d ) $■ = (w) $■ — AE 
in the forward process and (w d )b = ( w )b + AE in the back- 



work. The Crooks estimator has a binary character: ei- 
ther it yields a reliable solution, as long as the forward 
and the backward work PDFs overlap, or provides no 
solution at all. At best an upper and a lower bound 
can be estimated from the extension of the gap between 
Pf{w) and pb(—w). For Gaussian work PDFs the Ben- 
nett method leads to precise estimates over a remarkably 
wide range of pulling speeds and sampling sizes. For non- 
Gaussian work PDFs, the bias of the Bennett estimate 
is still smaller than for other estimators. It, however, in- 
creases with increasing differences between the forward 
and backward dissipated works. 



Finally we like to emphasize that in all cases where the 
Jarzynski and Crooks estimators fail, the determination 
of AF can be substantially simplified by means of the 
1/2-formula, Eq. ([T8| . expressing the free energy differ- 
ence as half of the difference of the forward and backward 
Jarzynski estimates. 



ward process. 
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